On Testing the Missing at Random Assumption

Author

  • Manfred Jaeger
Abstract

Most approaches to learning from incomplete data are based on the assumption that unobserved values are missing at random (MAR). While the MAR assumption, as such, is not testable, it can become testable in the context of other distributional assumptions, e.g. the naive Bayes assumption. In this paper we investigate a method for testing the MAR assumption in the presence of other distributional constraints. We present methods to (approximately) compute a test statistic consisting of the ratio of two profile likelihood functions. This requires the optimization of the likelihood under no assumptions on the missingness mechanism, for which we use our recently proposed AI&M algorithm. We present experimental results on synthetic data that show that our approximate test statistic is a good indicator of whether data is MAR relative to the given distributional assumptions.
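
As a rough illustration of the kind of statistic the abstract describes, the following sketch writes out a ratio of two profile likelihoods. All notation is assumed here and is not taken from the paper, whose exact definitions may differ: $d$ is the incomplete data, $\theta$ the parameters of the complete-data model, $\Theta$ the set of parameters satisfying the distributional assumptions (e.g. naive Bayes), $\lambda$ the missingness mechanism, and $\Lambda_{\mathrm{MAR}} \subseteq \Lambda$ the MAR and unrestricted classes of mechanisms.

```latex
% Hypothetical notation; a sketch of a profile likelihood ratio, not the paper's definition.
\begin{align*}
  LP_{\mathrm{MAR}}(\theta \mid d) &= \sup_{\lambda \in \Lambda_{\mathrm{MAR}}} L(\theta, \lambda \mid d),
  &
  LP(\theta \mid d) &= \sup_{\lambda \in \Lambda} L(\theta, \lambda \mid d), \\[4pt]
  T(d) &= \frac{\sup_{\theta \in \Theta} LP_{\mathrm{MAR}}(\theta \mid d)}
               {\sup_{\theta \in \Theta} LP(\theta \mid d)},
  &
  0 &\le T(d) \le 1 .
\end{align*}
```

Because the MAR mechanisms form a subset of all mechanisms, the ratio cannot exceed 1. Under this reading, values close to 1 would be compatible with MAR relative to the assumed model class, and small values would count as evidence against it. The denominator corresponds to the likelihood optimized under no assumptions on the missingness mechanism, which the abstract says is handled with the AI&M algorithm.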


Similar resources

Hardy Weinberg Equilibrium Testing and Interpretation: Focus on infection

Hardy-Weinberg equilibrium (HWE) holds when, in a closed population with random mating and without mutation and natural selection, genotype frequencies at any locus are a simple function of allele frequencies. Testing for HWE is now a common practice in population genetics and genetic association studies of non-communicable diseases; however, it is less regarded, or sometimes misinterpreted, i...


Collaborative Filtering and the Missing at Random Assumption

Rating prediction is an important application and a popular research topic in collaborative filtering. However, both the validity of learning algorithms and the validity of standard testing procedures rest on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of...


Flow Shop Scheduling Problem with Missing Operations: Genetic Algorithm and Tabu Search

The flow shop scheduling problem with missing operations is studied in this paper. The missing operations assumption refers to the fact that at least one job does not visit one machine in the production process. A mixed-binary integer programming model is presented for this problem to minimize the makespan. The genetic algorithm (GA) and tabu search (TS) are used to deal with the optimization...


Stage Life Testing with Missing Stage Information - an EM-Algorithm Approach

We consider a stage life testing model and assume that information about the levels at which the failures occurred is not available. In order to find estimates for the lifetime distribution parameters, we propose an EM-algorithm approach which interprets the lack of knowledge about the stages as missing information. Furthermore, we illustrate the implementation difficulties caused by an increasing nu...


Testing for associations with missing high-dimensional categorical covariates.

Understanding how long-term clinical outcomes relate to short-term response to therapy is an important topic of research with a variety of applications. In HIV, early measures of viral RNA levels are known to be a strong prognostic indicator of future viral load response. However, mutations observed in the high-dimensional viral genotype at an early time point may change this prognosis. Unfortu...




Publication date: 2006